Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: union_categorical supports identical categories with ordered #13763

Closed

Conversation

sinhrks
Copy link
Member

@sinhrks sinhrks commented Jul 23, 2016

.append / concat can handle ordered Categorical which has the identical categories. However, union_categorical raises TypeError. This PR allows union_categorical to handle the case.

on current mastser

s1 = pd.Series(pd.Categorical([1, 2, 3], ordered=True))
s2 = pd.Series(pd.Categorical([3, 2, 1], categories=[1, 2, 3], ordered=True))
s1.append(s2)

#0    1
#1    2
#2    3
#0    3
#1    2
#2    1
# dtype: category
# Categories (3, int64): [1 < 2 < 3]

pd.types.concat.union_categoricals([s1.values, s2.values])
# TypeError: Can only combine unordered Categoricals

CC: @JanSchulz, @chris-b1

@sinhrks sinhrks added Enhancement Reshaping Concat, Merge/Join, Stack/Unstack, Explode Categorical Categorical Data Type labels Jul 23, 2016
@sinhrks sinhrks added this to the 0.19.0 milestone Jul 23, 2016
@codecov-io
Copy link

codecov-io commented Jul 23, 2016

Current coverage is 85.33% (diff: 100%)

Merging #13763 into master will increase coverage by <.01%

@@             master     #13763   diff @@
==========================================
  Files           141        141          
  Lines         50703      50717    +14   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43267      43281    +14   
  Misses         7436       7436          
  Partials          0          0          

Powered by Codecov. Last update 1367945...9cadc4e

@sinhrks sinhrks force-pushed the union_categoricals_ordered branch from ead9cbb to 7540677 Compare July 23, 2016 13:15
and will raise if any are ordered.
`union_categoricals` only works with:
- unordered categoricals
- ordered categoricals which has the same categories
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has -> have

@chris-b1
Copy link
Contributor

Looks good to me, other than a couple nit-picky doc things.

@sinhrks sinhrks force-pushed the union_categoricals_ordered branch from 7540677 to 0971621 Compare July 23, 2016 13:24
@chris-b1
Copy link
Contributor

I created - sinhrks#1 to also add a sort_categories argument, which is needed for #13406

@jankatins
Copy link
Contributor

In my thinking union_categorical is a "unstricter" (and "uncleaner") variant of the normal categorical handling: where the usual cat handling errors, this should "bend the rules" and succeed. Following this, I thinking union_categorical should handle whatever append can take...

`union_categoricals` only works with unordered categoricals
and will raise if any are ordered.
`union_categoricals` only works with:
- unordered categoricals
Copy link
Contributor

@jankatins jankatins Jul 24, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this should read:

In addition to the "easy" case of combining two categoricals of the same categories and order information (e.g. what you could also append for), union_categoricals only works with unordered categoricals and will raise if any are ordered.

I.e. make it clear that union_categoricals supports something more than append.

@sinhrks sinhrks force-pushed the union_categoricals_ordered branch from 0971621 to 9cadc4e Compare July 25, 2016 22:49
@jreback
Copy link
Contributor

jreback commented Jul 29, 2016

@chris-b1 , @JanSchulz any final comments?

@chris-b1
Copy link
Contributor

No, looks good to me.

On Jul 28, 2016 7:32 PM, "Jeff Reback" [email protected] wrote:

@chris-b1 https://github.com/chris-b1 , @JanSchulz
https://github.com/janschulz any final comments?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#13763 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AB1b_FOSzzp-Gnvwk0SvEXA7FwPCj4fkks5qaUolgaJpZM4JTSrP
.

@jankatins
Copy link
Contributor

LGTM

@jreback jreback closed this in 59f2557 Jul 29, 2016
@jreback
Copy link
Contributor

jreback commented Jul 29, 2016

ty sir!

@sinhrks sinhrks deleted the union_categoricals_ordered branch July 29, 2016 10:37
jreback pushed a commit that referenced this pull request Aug 3, 2016
 - needed for #13406, follow-up to #13763

Author: Chris <[email protected]>
Author: sinhrks <[email protected]>

Closes #13846 from chris-b1/union_categoricals_ordered and squashes the following commits:

3a710f0 [Chris] lint fix
ff0bb5e [Chris] add follow-up PRs to whatsnew
ecb2ae9 [Chris] more tests; handle sorth with ordered
eea1777 [Chris] skip r-esort when possible on fastpath
c559662 [sinhrks] ENH: add sort_categories argument to union_categoricals
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Categorical Categorical Data Type Enhancement Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants